OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Developing integrated workflows for the digitisation of herbarium specimens using a modular and scalable approach

Internal identifier: 000266 (Main/Exploration); previous: 000265; next: 000267

Authors: Elspeth Haston [United Kingdom]; Robert Cubey [United Kingdom]; Martin Pullan [United Kingdom]; Hannah Atkins [United Kingdom]; David J. Harris [United Kingdom]

Source: ZooKeys, 2012 (ISSN 1313-2989)

RBID : PMC:3406469

Abstract

Digitisation programmes in many institutes frequently involve disparate and irregular funding, diverse selection criteria and scope, with different members of staff managing and operating the processes. These factors have influenced the decision at the Royal Botanic Garden Edinburgh to develop an integrated workflow for the digitisation of herbarium specimens which is modular and scalable to enable a single overall workflow to be used for all digitisation projects. This integrated workflow is comprised of three principal elements: a specimen workflow, a data workflow and an image workflow.

The specimen workflow is strongly linked to curatorial processes which will impact on the prioritisation, selection and preparation of the specimens. The importance of including a conservation element within the digitisation workflow is highlighted. The data workflow includes the concept of three main categories of collection data: label data, curatorial data and supplementary data. It is shown that each category of data has its own properties which influence the timing of data capture within the workflow. Development of software has been carried out for the rapid capture of curatorial data, and optical character recognition (OCR) software is being used to increase the efficiency of capturing label data and supplementary data. The large number and size of the images has necessitated the inclusion of automated systems within the image workflow.


Url: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3406469
DOI: 10.3897/zookeys.209.3121
PubMed: 22859881
PubMed Central: 3406469


Affiliations: Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR, UK



The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Developing integrated workflows for the digitisation of herbarium specimens using a modular and scalable approach</title>
<author>
<name sortKey="Haston, Elspeth" sort="Haston, Elspeth" uniqKey="Haston E" first="Elspeth" last="Haston">Elspeth Haston</name>
<affiliation wicri:level="1">
<nlm:aff id="A1">Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR, UK</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR</wicri:regionArea>
<wicri:noRegion>EH3 5LR</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Cubey, Robert" sort="Cubey, Robert" uniqKey="Cubey R" first="Robert" last="Cubey">Robert Cubey</name>
<affiliation wicri:level="1">
<nlm:aff id="A1">Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR, UK</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR</wicri:regionArea>
<wicri:noRegion>EH3 5LR</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Pullan, Martin" sort="Pullan, Martin" uniqKey="Pullan M" first="Martin" last="Pullan">Martin Pullan</name>
<affiliation wicri:level="1">
<nlm:aff id="A1">Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR, UK</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR</wicri:regionArea>
<wicri:noRegion>EH3 5LR</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Atkins, Hannah" sort="Atkins, Hannah" uniqKey="Atkins H" first="Hannah" last="Atkins">Hannah Atkins</name>
<affiliation wicri:level="1">
<nlm:aff id="A1">Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR, UK</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR</wicri:regionArea>
<wicri:noRegion>EH3 5LR</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Harris, David J" sort="Harris, David J" uniqKey="Harris D" first="David J" last="Harris">David J. Harris</name>
<affiliation wicri:level="1">
<nlm:aff id="A1">Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR, UK</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR</wicri:regionArea>
<wicri:noRegion>EH3 5LR</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">22859881</idno>
<idno type="pmc">3406469</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3406469</idno>
<idno type="RBID">PMC:3406469</idno>
<idno type="doi">10.3897/zookeys.209.3121</idno>
<date when="2012">2012</date>
<idno type="wicri:Area/Pmc/Corpus">000201</idno>
<idno type="wicri:Area/Pmc/Curation">000201</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000109</idno>
<idno type="wicri:source">PubMed</idno>
<idno type="wicri:Area/PubMed/Corpus">000028</idno>
<idno type="wicri:Area/PubMed/Curation">000028</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000028</idno>
<idno type="wicri:Area/Ncbi/Merge">000139</idno>
<idno type="wicri:Area/Ncbi/Curation">000139</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">000139</idno>
<idno type="wicri:doubleKey">1313-2989:2012:Haston E:developing:integrated:workflows</idno>
<idno type="wicri:Area/Main/Merge">000269</idno>
<idno type="wicri:Area/Main/Curation">000266</idno>
<idno type="wicri:Area/Main/Exploration">000266</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Developing integrated workflows for the digitisation of herbarium specimens using a modular and scalable approach</title>
<author>
<name sortKey="Haston, Elspeth" sort="Haston, Elspeth" uniqKey="Haston E" first="Elspeth" last="Haston">Elspeth Haston</name>
<affiliation wicri:level="1">
<nlm:aff id="A1">Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR, UK</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR</wicri:regionArea>
<wicri:noRegion>EH3 5LR</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Cubey, Robert" sort="Cubey, Robert" uniqKey="Cubey R" first="Robert" last="Cubey">Robert Cubey</name>
<affiliation wicri:level="1">
<nlm:aff id="A1">Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR, UK</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR</wicri:regionArea>
<wicri:noRegion>EH3 5LR</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Pullan, Martin" sort="Pullan, Martin" uniqKey="Pullan M" first="Martin" last="Pullan">Martin Pullan</name>
<affiliation wicri:level="1">
<nlm:aff id="A1">Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR, UK</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR</wicri:regionArea>
<wicri:noRegion>EH3 5LR</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Atkins, Hannah" sort="Atkins, Hannah" uniqKey="Atkins H" first="Hannah" last="Atkins">Hannah Atkins</name>
<affiliation wicri:level="1">
<nlm:aff id="A1">Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR, UK</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR</wicri:regionArea>
<wicri:noRegion>EH3 5LR</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Harris, David J" sort="Harris, David J" uniqKey="Harris D" first="David J" last="Harris">David J. Harris</name>
<affiliation wicri:level="1">
<nlm:aff id="A1">Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR, UK</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR</wicri:regionArea>
<wicri:noRegion>EH3 5LR</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series>
<title level="j">ZooKeys</title>
<idno type="ISSN">1313-2989</idno>
<idno type="eISSN">1313-2970</idno>
<imprint>
<date when="2012">2012</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<label>Abstract</label>
<p>Digitisation programmes in many institutes frequently involve disparate and irregular funding, diverse selection criteria and scope, with different members of staff managing and operating the processes. These factors have influenced the decision at the Royal Botanic Garden Edinburgh to develop an integrated workflow for the digitisation of herbarium specimens which is modular and scalable to enable a single overall workflow to be used for all digitisation projects. This integrated workflow is comprised of three principal elements: a specimen workflow, a data workflow and an image workflow.</p>
<p>The specimen workflow is strongly linked to curatorial processes which will impact on the prioritisation, selection and preparation of the specimens. The importance of including a conservation element within the digitisation workflow is highlighted. The data workflow includes the concept of three main categories of collection data: label data, curatorial data and supplementary data. It is shown that each category of data has its own properties which influence the timing of data capture within the workflow. Development of software has been carried out for the rapid capture of curatorial data, and optical character recognition (OCR) software is being used to increase the efficiency of capturing label data and supplementary data. The large number and size of the images has necessitated the inclusion of automated systems within the image workflow.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Beach, J" uniqKey="Beach J">J Beach</name>
</author>
<author>
<name sortKey="Blum, S" uniqKey="Blum S">S Blum</name>
</author>
<author>
<name sortKey="Donoghue, M" uniqKey="Donoghue M">M Donoghue</name>
</author>
<author>
<name sortKey="Ford, L" uniqKey="Ford L">L Ford</name>
</author>
<author>
<name sortKey="Guralnick, R" uniqKey="Guralnick R">R Guralnick</name>
</author>
<author>
<name sortKey="Mares, M" uniqKey="Mares M">M Mares</name>
</author>
<author>
<name sortKey="Thiers, B" uniqKey="Thiers B">B Thiers</name>
</author>
<author>
<name sortKey="Westneat, M" uniqKey="Westneat M">M Westneat</name>
</author>
<author>
<name sortKey="Wheeler, Q" uniqKey="Wheeler Q">Q Wheeler</name>
</author>
<author>
<name sortKey="Wiegmann, B" uniqKey="Wiegmann B">B Wiegmann</name>
</author>
<author>
<name sortKey="The Network Integrated Biocollection, Alliance" uniqKey="The Network Integrated Biocollection A">Alliance the Network Integrated Biocollection</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Beaman, Rs" uniqKey="Beaman R">RS Beaman</name>
</author>
<author>
<name sortKey="Cellinese, N" uniqKey="Cellinese N">N Cellinese</name>
</author>
<author>
<name sortKey="Heidorn, Pb" uniqKey="Heidorn P">PB Heidorn</name>
</author>
<author>
<name sortKey="Guo, Y" uniqKey="Guo Y">Y Guo</name>
</author>
<author>
<name sortKey="Green, Am" uniqKey="Green A">AM Green</name>
</author>
<author>
<name sortKey="Thiers, B" uniqKey="Thiers B">B Thiers</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Beaman, R" uniqKey="Beaman R">R Beaman</name>
</author>
<author>
<name sortKey="Macklin, Ja" uniqKey="Macklin J">JA Macklin</name>
</author>
<author>
<name sortKey="Donoghue, Mj" uniqKey="Donoghue M">MJ Donoghue</name>
</author>
<author>
<name sortKey="And Hanken, J" uniqKey="And Hanken J">J Hanken</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Berendsohn, Wg" uniqKey="Berendsohn W">WG Berendsohn</name>
</author>
<author>
<name sortKey="Chavan, V" uniqKey="Chavan V">V Chavan</name>
</author>
<author>
<name sortKey="Macklin, Ja" uniqKey="Macklin J">JA Macklin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Berendsohn, Wg" uniqKey="Berendsohn W">WG Berendsohn</name>
</author>
<author>
<name sortKey="Seltmann, P" uniqKey="Seltmann P">P Seltmann</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Berents, P" uniqKey="Berents P">P Berents</name>
</author>
<author>
<name sortKey="Hamer, M" uniqKey="Hamer M">M Hamer</name>
</author>
<author>
<name sortKey="Chavan, V" uniqKey="Chavan V">V Chavan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Best, Jh" uniqKey="Best J">JH Best</name>
</author>
<author>
<name sortKey="Moen, We" uniqKey="Moen W">WE Moen</name>
</author>
<author>
<name sortKey="Neill, Ak" uniqKey="Neill A">AK Neill</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="European Commission" uniqKey="European Commission">European Commission</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Granzow De La Cerda, I" uniqKey="Granzow De La Cerda I">Í Granzow-de la Cerda</name>
</author>
<author>
<name sortKey="Beach, Jh" uniqKey="Beach J">JH Beach</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Haston, E" uniqKey="Haston E">E Haston</name>
</author>
<author>
<name sortKey="Cubey, R" uniqKey="Cubey R">R Cubey</name>
</author>
<author>
<name sortKey="Harris, Dj" uniqKey="Harris D">DJ Harris</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Heidorn, Pb" uniqKey="Heidorn P">PB Heidorn</name>
</author>
<author>
<name sortKey="Wei, Q" uniqKey="Wei Q">Q Wei</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kroes, N" uniqKey="Kroes N">N Kroes</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lafferty, D" uniqKey="Lafferty D">D Lafferty</name>
</author>
<author>
<name sortKey="Landrum, Lr" uniqKey="Landrum L">LR Landrum</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Llewellyn, C" uniqKey="Llewellyn C">C Llewellyn</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Niggeman, E" uniqKey="Niggeman E">E Niggeman</name>
</author>
<author>
<name sortKey="De Decker, J" uniqKey="De Decker J">J De Decker</name>
</author>
<author>
<name sortKey="Levy, M" uniqKey="Levy M">M Lévy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Vollmar, A" uniqKey="Vollmar A">A Vollmar</name>
</author>
<author>
<name sortKey="Macklin, Ja" uniqKey="Macklin J">JA Macklin</name>
</author>
<author>
<name sortKey="Ford, Ls" uniqKey="Ford L">LS Ford</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<affiliations>
<list>
<country>
<li>Royaume-Uni</li>
</country>
</list>
<tree>
<country name="Royaume-Uni">
<noRegion>
<name sortKey="Haston, Elspeth" sort="Haston, Elspeth" uniqKey="Haston E" first="Elspeth" last="Haston">Elspeth Haston</name>
</noRegion>
<name sortKey="Atkins, Hannah" sort="Atkins, Hannah" uniqKey="Atkins H" first="Hannah" last="Atkins">Hannah Atkins</name>
<name sortKey="Cubey, Robert" sort="Cubey, Robert" uniqKey="Cubey R" first="Robert" last="Cubey">Robert Cubey</name>
<name sortKey="Harris, David J" sort="Harris, David J" uniqKey="Harris D" first="David J" last="Harris">David J. Harris</name>
<name sortKey="Pullan, Martin" sort="Pullan, Martin" uniqKey="Pullan M" first="Martin" last="Pullan">Martin Pullan</name>
</country>
</tree>
</affiliations>
</record>
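
The record above can also be read programmatically. The following is a minimal Python sketch, not part of Dilib, that pulls the title, authors, DOI and PubMed ID out of such a record. It assumes the record has been saved as text. Because the record uses the `wicri:` and `nlm:` prefixes without declaring them, a strict XML parser would reject it, so the sketch binds both prefixes to placeholder URIs (`urn:example:wicri` and `urn:example:nlm` are hypothetical, not official namespaces). The names `parse_record`, `summarize` and `SAMPLE` are illustrative only.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical placeholder URIs: the record does not declare these prefixes,
# so they are bound here only to make the text parseable.
PLACEHOLDER_NS = {
    "wicri": "urn:example:wicri",
    "nlm": "urn:example:nlm",
}

# Trimmed excerpt of the record shown above (two authors, two identifiers).
SAMPLE = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Developing integrated workflows for the digitisation of herbarium specimens using a modular and scalable approach</title>
<author>
<name sortKey="Haston, Elspeth" uniqKey="Haston E">Elspeth Haston</name>
<affiliation wicri:level="1">
<nlm:aff id="A1">Royal Botanic Garden Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR, UK</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Cubey, Robert" uniqKey="Cubey R">Robert Cubey</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="pmid">22859881</idno>
<idno type="doi">10.3897/zookeys.209.3121</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""


def parse_record(xml_text):
    """Parse a record after declaring the placeholder namespaces on <record>."""
    decls = " ".join(f'xmlns:{p}="{u}"' for p, u in PLACEHOLDER_NS.items())
    patched = re.sub(r"<record\b", f"<record {decls}", xml_text, count=1)
    return ET.fromstring(patched)


def summarize(record):
    """Return title, author names, DOI and PubMed ID from the header."""
    title_stmt = record.find("./TEI/teiHeader/fileDesc/titleStmt")
    pub_stmt = record.find("./TEI/teiHeader/fileDesc/publicationStmt")
    return {
        "title": title_stmt.findtext("title"),
        "authors": [n.text for n in title_stmt.findall("author/name")],
        "doi": pub_stmt.findtext("idno[@type='doi']"),
        "pmid": pub_stmt.findtext("idno[@type='pmid']"),
    }


if __name__ == "__main__":
    print(summarize(parse_record(SAMPLE)))
```

Reading from `titleStmt` rather than searching the whole tree avoids counting each author twice, since the same names are repeated under `sourceDesc` and `affiliations`.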

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000266 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000266 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     PMC:3406469
   |texte=   Developing integrated workflows for the digitisation of herbarium specimens using a modular and scalable approach
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i   -Sk "pubmed:22859881" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd   \
       | NlmPubMed2Wicri -a OcrV1 

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024